Classification with Support Vector Machines

Midterm Exam Updates

  • Concept exams have been graded and released
  • The letter grade associated with your exam is your grade without revisions
  • You are permitted to submit one round of revisions to earn a higher grade
  • Revisions are due by Friday, December 15.

Take-Home Analysis

The data analysis portion of the exam will be graded no later than this Friday (November 17).

Midterm Reflections

  • Revisions must include thoughtful reflection on why your previous answer was incorrect and how your revised solution remedies these issues.

  • Revisions and reflections will be submitted to the Midterm Part 1 Grade & Revisions Canvas assignment.

Lab Revision Updates

  • Revisions on lab assignments will be accepted until Friday, December 15.

  • For the next three weeks, you are permitted to submit up to one (1) revision per week.

  • Remember, you are required to earn a “Satisfactory” on all but 2 assignments to earn a B in the course.

Plan for the Next Three Weeks

Week 8 – Support Vector Machines

  • Final Lab Assignment
  • Project Phase 2 (Introduction, Preliminary Analysis, Project Timeline)

Week 9 – Neural Networks


  • Final Activity
  • Project Phase 3 (Rough Draft)

Week 10 – Final Report

Maximal Margin Classifier

Federalist Papers

A Simple(r) Classification

Suppose we are interested in classifying a new observation as “John Jay” or “Not John Jay”.

Choosing a Line

There are many lines we could draw that split the training data perfectly between John Jay and not John Jay!

I’m just drawing any line that separates the two groups. Is there a better way to construct these lines?

Equidistant Between Observations

How should we choose between these two lines???

Maximal Margin

The “best” line is the one with the largest margin, i.e., the line that is furthest from the nearest observation on either side.

  • Where could I add a John Jay essay which would not change the line?
  • Where could I add a John Jay essay which would change the line?
  • How many observations control the shape of the line?

Difficulties with Hard Margins


It is rare to have observations that perfectly fall on either side of a line / hyperplane.


Adding one more observation could totally change our classification line!

A Different Comparison

Suppose we wanted instead to separate “Hamilton” from “Not Hamilton”…

Where should we draw our line?

Soft Margin

A soft margin is a margin that is permitted to contain a certain number of misclassified points.

There are two decisions to make here:


  1. How big is our margin?

(M = width of margin)

  2. How many misclassified observations are we willing to have?

(C = cost of a misclassified point)
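In practice, the cost C is usually not picked by hand but tuned, for example by cross-validation. A minimal sketch of how that decision maps onto tidymodels (the specification below is illustrative; `cost_grid` is an assumed name, not from the lecture code):

```r
library(tidymodels)

# Mark cost as a tuning parameter instead of fixing it
svm_tune_spec <- svm_linear(cost = tune()) %>%
  set_mode("classification") %>%
  set_engine("kernlab")

# Candidate values for C: larger cost penalizes misclassified
# points more heavily, shrinking the soft margin toward a hard one
cost_grid <- grid_regular(cost(), levels = 5)
```

The chosen grid would then be passed to `tune_grid()` along with a workflow and resamples.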

An Initial Attempt

Width of margin: 2

How many points in the margin are misclassified?

A Second Attempt

Width of margin: 1

How many points in the margin are misclassified?

Support Vector Classifier

The support vectors are the observations that fall on or within the soft margin, including any that are misclassified.


A support vector classifier tries to find:

a line / hyperplane that will be used to classify future observations …

… that gives us the biggest margin width (M)…

… while still respecting the cost of misclassified points (C).

Fitting a Linear Support Vector Classifier with tidymodels


# Specify a linear SVM with cost C = 3
svm_spec <- svm_linear(cost = 3, margin = 0.5) %>%
  set_mode("classification") %>%
  set_engine("kernlab")

# Predict the author from the first two principal components
fed_recipe <- recipe(author_ah ~ PC1 + PC2, data = fed_pca_ah)

# Bundle the model and recipe into a workflow
fed_wflow <- workflow() %>%
  add_model(svm_spec) %>%
  add_recipe(fed_recipe)

# Fit the workflow to the Federalist Papers data
my_svm <- fed_wflow %>%
  fit(fed_pca_ah)

Install a New Package

You will need to install the kernlab package this week!
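Installation is a one-time step from CRAN:

```r
# Install kernlab once; tidymodels uses it as the SVM engine
install.packages("kernlab")
```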

Inspecting the Model Fit

my_svm %>% 
  extract_fit_parsnip()
parsnip model object

Support Vector Machine object of class "ksvm" 

SV type: C-svc  (classification) 
 parameter : cost C = 3 

Linear (vanilla) kernel function. 

Number of Support Vectors : 16 

Objective Function Value : -38.0067 
Training error : 0.057143 
Probability model included. 

Making Predictions

predict(my_svm, new_data = fed_pca_df)
# A tibble: 70 × 1
   .pred_class 
   <fct>       
 1 Hamilton    
 2 Not Hamilton
 3 Not Hamilton
 4 Not Hamilton
 5 Not Hamilton
 6 Not Hamilton
 7 Hamilton    
 8 Hamilton    
 9 Not Hamilton
10 Not Hamilton
# ℹ 60 more rows

Plotting Predictions

How well did the model do?

Model Metrics

predict(my_svm, new_data = fed_pca_ah) %>% 
  bind_cols(truth = fed_pca_ah$author_ah) %>% 
  rename(prediction = .pred_class) %>% 
  conf_mat(truth = truth, 
           estimate = prediction)
              Truth
Prediction     Hamilton Not Hamilton
  Hamilton           48            1
  Not Hamilton        3           18
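From this confusion matrix, overall accuracy is (48 + 18) / 70 ≈ 0.943, which agrees with the training error of 0.057143 reported by kernlab. The same pipeline can report accuracy directly with yardstick (a sketch using the objects defined above):

```r
# Accuracy = proportion of correct predictions on the training data
predict(my_svm, new_data = fed_pca_ah) %>%
  bind_cols(truth = fed_pca_ah$author_ah) %>%
  accuracy(truth = truth, estimate = .pred_class)
```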

Not Separable Groups

What if we simply couldn’t separate our data with a line / hyperplane?

Increasing the Dimensions

What if we imagine our points exist in three dimensions?

Support Vector Machine

A support vector machine is an extension of the support vector classifier that results from enlarging the feature space in a specific way, using kernels.

In this class, we will only implement polynomial kernels.
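In tidymodels, the only change from the linear classifier is the model specification: parsnip provides `svm_poly()` with a `degree` argument. A sketch, with degree = 2 as an illustrative (not prescribed) choice:

```r
# Polynomial-kernel SVM; swap this spec into the existing workflow
svm_poly_spec <- svm_poly(cost = 3, degree = 2) %>%
  set_mode("classification") %>%
  set_engine("kernlab")
```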

Try it!

Open 07-SVM.qmd to explore how SVM (and PCA) can be used to classify the class of different zoo animals.